BLEU+: a Tool for Fine-Grained BLEU Computation

نویسندگان

A. Cüneyd Tantug

Kemal Oflazer

Ilknur Durgar El-Kahlout

چکیده

We present a tool, BLEU+, which implements various extension to BLEU computation to allow for a better understanding of the translation performance, especially for morphologically complex languages. BLEU+ takes into account both “closeness” in morphological structure, “closeness” of the root words in the WordNet hierarchy while comparing tokens in the candidate and reference sentence. In addition to gauging performance at a finer level of granularity, BLEU+ also allows the computation of various upper bound oracle scores: comparing all tokens considering only the roots allows us to get an upper bound when all errors due to morphological structure are fixed, while comparing tokens in an error-tolerant way considering minor morpheme edit operations, allows us to get a (more realistic) upper bound when tokens that differ in morpheme insertions/deletions and substitutions are fixed. We use BLEU+ in the fine-grained evaluation of the output of our English-to-Turkish statistical MT system.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fine-grained attention mechanism for neural machine translation

Neural machine translation (NMT) has been a new paradigm in machine translation, and the attention mechanism has become the dominant approach with the state-of-the-art records in many language pairs. While there are variants of the attention mechanism, all of them use only temporal attention where one scalar value is assigned to one context vector corresponding to a source word. In this paper, ...

متن کامل

Extending Statistical Machine Translation with Discriminative and Trigger-Based Lexicon Models

In this work, we propose two extensions of standard word lexicons in statistical machine translation: A discriminative word lexicon that uses sentence-level source information to predict the target words and a trigger-based lexicon model that extends IBM model 1 with a second trigger, allowing for a more fine-grained lexical choice of target words. The models capture dependencies that go beyond...

متن کامل

ParFDA for Instance Selection for Statistical Machine Translation

We build parallel feature decay algorithms (ParFDA) Moses statistical machine translation (SMT) systems for all language pairs in the translation task at the first conference on statistical machine translation (Bojar et al., 2016a) (WMT16). ParFDA obtains results close to the top constrained phrase-based SMT with an average of 2.52 BLEU points difference using significantly less computation for...

متن کامل

A Class-Based Agreement Model for Generating Accurately Inflected Translations

When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. To address this issue, we present a target-side, class-based agreement model. Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes ...

متن کامل

Multiword Expressions in the Context of Statistical Machine Translation

Incorporating semantic information in the Statistical Machine Translation (SMT) framework is starting to gain some popularity in both the semantics and translation communities. In this paper, we present encouraging results obtained from experiments conducted on English to Arabic SMT system using static, dynamic, and hybrid integration of fine-grained Multiword Expression (MWE). We achieve an im...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2008

BLEU+: a Tool for Fine-Grained BLEU Computation

نویسندگان

چکیده

منابع مشابه

Fine-grained attention mechanism for neural machine translation

Extending Statistical Machine Translation with Discriminative and Trigger-Based Lexicon Models

ParFDA for Instance Selection for Statistical Machine Translation

A Class-Based Agreement Model for Generating Accurately Inflected Translations

Multiword Expressions in the Context of Statistical Machine Translation

عنوان ژورنال:

اشتراک گذاری